N=1:

rank | frequency | n-gram |
---|---|---|
1 | 27041 | -o |
2 | 24281 | -e |
3 | 21182 | -a |
4 | 20041 | -i |
5 | 3985 | -” |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 6337 | -to |
2 | 5384 | -no |
3 | 5063 | -re |
4 | 4925 | -te |
5 | 4812 | -ti |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 3335 | -one |
2 | 2617 | -ato |
3 | 2357 | -ano |
4 | 2264 | -are |
5 | 2099 | -nte |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 2733 | -ione |
2 | 1386 | -ente |
3 | 1129 | -ento |
4 | 969 | -ando |
5 | 801 | -ioni |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 2139 | -zione |
2 | 968 | -mento |
3 | 796 | -mente |
4 | 580 | -zioni |
5 | 477 | -ranno |

The tables show the most frequent letter N-grams at the end of words for N=1…5. The procedure parallels 2.2.5 Most frequent word beginnings; the aim here is suffix detection rather than prefix detection.
For N=3:
SELECT @pos:=(@pos+1), xx.* FROM (SELECT @pos:=0) r, (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word,3)) FROM words WHERE w_id>100 GROUP BY RIGHT(word,3) ORDER BY cnt DESC) xx LIMIT 5;
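
The same counting can be sketched outside the database. The following Python snippet is a minimal illustration, not the code used to produce the tables: the function name `top_endings` and the toy word list are hypothetical, and the list stands in for the rows of the `words` table. Like `RIGHT(word, N)`, the slice `word[-n:]` returns the whole word when it is shorter than N.

```python
from collections import Counter

def top_endings(words, n, limit=5):
    """Return the `limit` most frequent word-final n-grams as (rank, count, ngram)."""
    # word[-n:] keeps the whole word when it is shorter than n,
    # mirroring the behaviour of RIGHT(word, n) in the query above.
    counts = Counter("-" + word[-n:] for word in words)
    # most_common() sorts by descending count; ties keep first-seen order.
    return [
        (rank, cnt, ngram)
        for rank, (ngram, cnt) in enumerate(counts.most_common(limit), start=1)
    ]

# Hypothetical toy word list (assumption), not taken from the corpus.
words = ["nazione", "stazione", "azione", "momento", "mente", "cane"]

# One short table per N, analogous to the tables for N=1…5.
for n in range(1, 6):
    print(n, top_endings(words, n, limit=2))
```

Passing a different `n` corresponds to replacing the two occurrences of `3` in the query.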
See also: 2.2.5 Most frequent word beginnings